ABSTRACT

Galaxy formation inside dark matter halos, as well as the halo formation itself, can be affected 
by large-scale environments. Evaluating the imprints of environmental effects on galaxy clustering is 
crucial for precise cosmological constraints with data from galaxy redshift surveys. We investigate
such an environmental impact on both real-space and redshift-space galaxy clustering statistics using a 
semi-analytic model (SAM) derived from the Millennium Simulation. We compare clustering statistics 
from original SAM galaxy samples and shuffled ones with environmental influence on galaxy proper- 
ties eliminated. Among the luminosity-threshold samples examined, the one with the lowest threshold 
luminosity (~ 0.2L*) is affected by environmental effects the most, which has a ~10% decrease in 
the real-space two-point correlation function (2PCF) after shuffling. By decomposing the 2PCF into 
five different components based on the source of pairs, we show that the change in the 2PCF can be 
explained by the age and richness (galaxy occupation number) dependence of halo clustering. The 
2PCFs in redshift space are found to change in a similar manner after shuffling. If the environmental 
effects are neglected, halo occupation distribution modeling of the real-space and redshift-space
clustering may have a less than 6.5% systematic uncertainty in constraining $\sigma_8\Omega_m^{0.6}$ from the most affected
SAM sample and have substantially smaller uncertainties from the other, more luminous samples. We 
argue that the effect could be even smaller in reality. In the Appendix, we present a method to 
decompose the 2PCF, which can be applied to measure the two-point auto-correlation functions of 
galaxy sub-samples in a volume-limited galaxy sample and their two-point cross-correlation functions 
in a single run utilizing only one random catalog. 

Subject headings: galaxies: formation — galaxies: halos — large-scale structure of universe — cos- 
mology: theory — dark matter 



1. INTRODUCTION 

Recently, many authors have identified the environ- 
mental impact, which manifests itself as another degree 
of freedom, on the clustering of halos at fixed mass. 
Gao et al. (2005) found that low mass halos ($M < M_*$)
which formed earlier are more strongly clustered than
their younger counterparts, whereas for high mass halos,
older halos with $M > 10M_*$ turn out to be less clustered
than the younger ones (Wechsler et al. 2006; Jing et al.
2007; Wetzel et al. 2007), where $M_*$ is the nonlinear mass
scale for collapse. This environmental dependence of halo
clustering, namely "assembly bias" (Gao & White 2007),
contradicts excursion set theory (EST; Bond et al.
1991; Lacey & Cole 1993; Mo & White 1996), which predicts
that an individual halo evolves without awareness
of the larger environment except for its own mass when
it is observed (White 1996). In this paper, we investigate
the effect of the halo assembly bias on modeling the real- 
space and redshift-space galaxy clustering statistics and 
discuss the possible consequence on cosmological param- 
eter constraints from these clustering statistics. 

Several possible explanations have been proposed for
halo assembly bias, either by studying detailed halo
growth within high-resolution N-body simulations or
by improving current excursion set theory. Wang et al.
(2007) showed that the accretion of low mass halos in
dense regions is severely truncated due to tidal disruption
and preheating by their massive companions,
and Jing et al. (2007) suggested that the competition
for accretion resources also triggers a delayed accretion
phase which results in the inverse age-dependence of massive
halos, while Keselman & Nusser (2007) argued
that highly non-linear effects like tidal stripping may not
be the main driver for assembly bias. On the other hand,
Zentner (2006) implemented a toy model by substituting
the sharp-k filter in EST with a localized configuration
one, and Sandvik et al. (2007) integrated EST with the ellipsoidal
collapse model and barrier-crossing of pancakes
and filaments. Both theoretical trials claimed that the
assembly bias for massive halos could be naturally recovered,
or at least partly offset, by abandoning the Markovian
simplification in EST. Recently, Dalal et al. (2008) showed
that the assembly bias of rare massive halos is expected
from the statistics of peaks in Gaussian random fields,
and they argued that the formation of a non-accreting
sub-population of low-mass halos is responsible for the
assembly bias of low mass halos (also see Hahn et al.
l200l) . 

As the products of gas physics within dark matter halos,
galaxies have no reason to be immune from this environmental
effect. Croton et al. (2007) and Zhu et al.
(2006) showed that the environmental effect is transmitted
to the clustering and properties of galaxies in a semi-analytic
model (SAM) and a smoothed particle hydrodynamics
(SPH) simulation, both of which extract halo
merging histories directly from simulations rather than
from a Markovian process. Observationally, Yang et al. (2006)
and Berlind et al. (2006) found a residual dependence of
galaxy clustering on group properties other than group
mass by using group catalogs from the Two-Degree Field
Galaxy Redshift Survey (2dFGRS; Colless et al. 2001)
and the Sloan Digital Sky Survey (SDSS; York et al.
2000), respectively.

In modeling the galaxy clustering, the halo occupation
distribution (HOD) or the closely related conditional luminosity
function (CLF) is a powerful method to put
the observed galaxy clustering in an informative form that
describes the relation between galaxies and dark matter
halos (Jing et al. 1998; Seljak 2000; Peacock & Smith
2000; Scoccimarro et al. 2001; Cooray & Sheth 2002;
Berlind & Weinberg 2002; Yang et al. 2003; Zheng et al.
2005). It successfully explains the departure from a
power law in the observed galaxy two-point correlation
functions (Zehavi et al. 2005) and bridges the gap between
high-resolution N-body simulations of dark matter
particles and large scale surveys of galaxies. HOD
modeling also enhances the power of galaxy clustering in
constraining cosmological parameters by linking galaxies
to dark matter halos and using the clustering data on all
scales (van den Bosch et al. 2003; Abazajian et al. 2005;
Zheng & Weinberg 2007). However, one key assumption
in the current version of the HOD is based on the EST 
that the formation and the distribution of galaxies within 
halos are statistically determined solely by halo mass. 
Therefore, it is important to quantify the environmen- 
tal effect on modeling galaxy clustering within the HOD 
framework in the era of precision cosmology and provide 
insights to improve the HOD modeling. 

A natural way to study the environmental effect is to 
extend the current HOD framework by including a sec- 
ond halo variable other than halo mass and compare 
the modeling results with previous results. Many can- 
didates of halo variables, such as formation time, con- 
centration, substructure richness, and spin, have been 
scrutinized, but all proved incapable of capturing the environmental
effect neatly and completely (Gao & White
2007; Wechsler et al. 2006). One of the reasons for this
is that halo formation history is subject to incidental
merging events and uneven accretion phases, both of
which produce a large scatter in the relation between
any halo property and the environment (Wechsler et al.
2002; Zhao et al. 2003).

In the present study, we shuffle a semi-analytic galaxy 
sample to produce three sets of artificial samples, which 
either partly or completely lost their environmental fea- 
tures, and investigate the changes in real-space and 
redshift-space galaxy clustering statistics. The shuffling
enables us to see the consequences of neglecting
the environmental dependence in the current version of
HOD modeling and gives us an idea of the effect on constraining
cosmological parameters using these statistics
(e.g., Tinker et al. 2006). The structure of the paper
is as follows. In § 2, we introduce the simulation and
the SAM we use and describe our construction of
galaxy samples with different threshold luminosities from
the SAM. In § 3, we present three methods of shuffling
the galaxy samples aimed at eliminating the environmental
dependence. Then, in § 4, we analyze in detail the
effect of environments on the real-space two-point correlation
functions (2PCFs) by comparing the results between
samples before and after shuffling. In § 5, we study
the effect of environments on the redshift-space cluster- 
ing statistics. We conclude in § 6 with a brief discussion 
and summary. In the Appendix, we present a method to 
decompose the 2PCFs into different components based 
on the properties of galaxy pairs. This method can be 
generalized to apply to real data to measure the two- 
point auto-correlation functions of galaxy sub-samples 
in a volume-limited galaxy sample and their two-point 
cross-correlation functions in a single run utilizing only 
one random catalog. 

2. SIMULATION DATA AND SEMI-ANALYTIC MODEL 

In this study, we make use of outputs from a galaxy 
formation model based on the Millennium Simulation.
The Millennium Simulation (Springel 2005) follows the
hierarchical growth of dark matter structures from redshift
$z = 127$ to the present. The simulation adopts
a concordance cosmological model with ($\Omega_m$, $\Omega_\Lambda$, $\Omega_b$,
$\sigma_8$, $h$) = (0.25, 0.75, 0.045, 0.9, 0.73), and employs $2160^3$
particles of mass $8.6 \times 10^8\,h^{-1}M_\odot$ in a periodic box
with comoving size 500 $h^{-1}$ Mpc on a side. Friends-of-Friends
(FOF; Davis et al. 1985) halos are identified in
the simulation at each of the 64 snapshots with a linking
length 0.2 times the mean particle separation. Substructures
are then identified by the SUBFIND algorithm as
locally overdense regions in the background FOF halos
(Springel et al. 2001). Detailed merger trees of all gravitationally
self-bound dark matter clumps constructed
from this simulation provide a key ingredient for semi- 
analytic models of galaxy formation. 

The galaxy catalog we use is from the semi-analytic
model (SAM) of De Lucia & Blaizot (2007), which is
an updated version of that of Croton et al. (2006) and
De Lucia et al. (2006). This model explores a variety of
physical processes related to galaxy formation. It can 
reproduce many observed properties of galaxies in the 
local universe, including the galaxy luminosity function, 
the bimodal distribution of colors, the Tully-Fisher re- 
lation, the morphology distribution, and the 2PCFs for 
various type and luminosity selected samples. This par- 
ticular model is of course not guaranteed to be abso- 
lutely right. What is important to our study here is that 
the environment-dependent ingredients inherent in this 
model, such as the history of dynamical interactions and 
mergers of halos, should be well transmitted to and pre- 
served in the resultant galaxy population. We aim to 
investigate the likely effects of the environmental depen- 
dence in this model on galaxy clustering statistics in real 
space and redshift space and explore the implications for 
cosmological study with galaxy clustering data. 

We construct six luminosity-threshold galaxy samples
at $z = 0$ from the SAM catalog according to the rest-frame
SDSS r-band absolute magnitude $M_r$ with dust
extinction included. Table 1 lists the properties of these
samples. Our L207 sample has a number density similar
to the observed $L > L_*$ sample (see Table 2 of
Zehavi et al. 2005). Since more luminous galaxies tend
to reside in more massive halos (e.g., Zehavi et al. 2005),
these six samples can probe different halo mass ranges
(from mass below $M_*$ to that above $M_*$). The halo assembly
bias has different amplitudes and signs across
these mass ranges (e.g., Gao et al. 2005; Gao & White
2007; Wechsler et al. 2006; Jing et al. 2007); we therefore
expect different environmental effects from the six
samples.

TABLE 1

Properties of the Luminosity-threshold Samples

Name   M_r threshold   n (10^-2 h^3 Mpc^-3)   N_gal     N_halo
L190   -19.0           1.835                  2293947   1661007
L200   -20.0           0.771                  963452    730330
L207   -20.7           0.293                  365845    289226
L210   -21.0           0.167                  209206    169864
L217   -21.7           0.032                  39402     34129
L220   -22.0           0.013                  16084     14148
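
As a consistency check on Table 1, the number densities follow directly from the galaxy counts and the simulation volume. The short Python sketch below recomputes them from the tabulated counts and the 500 $h^{-1}$ Mpc box. The function and variable names are illustrative and not part of the original analysis, and the satellite fraction estimate additionally assumes that every host halo contributes exactly one central galaxy to the sample, which is a simplification made only for this sketch.

```python
BOX_SIZE = 500.0  # comoving box side in h^-1 Mpc (Section 2)

# (N_gal, N_halo) from Table 1, keyed by sample name.
SAMPLES = {
    "L190": (2293947, 1661007),
    "L200": (963452, 730330),
    "L207": (365845, 289226),
    "L210": (209206, 169864),
    "L217": (39402, 34129),
    "L220": (16084, 14148),
}


def number_density(n_gal, box_size=BOX_SIZE):
    """Mean comoving number density in h^3 Mpc^-3 for a periodic cubic box."""
    return n_gal / box_size ** 3


def satellite_fraction(n_gal, n_halo):
    """Rough satellite fraction.

    Assumption (ours): each host halo has exactly one central galaxy in the
    sample, so the number of satellites is N_gal - N_halo.
    """
    return (n_gal - n_halo) / n_gal


for name, (n_gal, n_halo) in SAMPLES.items():
    density = 100.0 * number_density(n_gal)  # in units of 10^-2 h^3 Mpc^-3
    f_sat = satellite_fraction(n_gal, n_halo)
    print(f"{name}: n = {density:.3f} x 10^-2 h^3 Mpc^-3, f_sat ~ {f_sat:.2f}")
```

Under the stated assumption, the estimated satellite fraction drops from roughly 0.28 for L190 to roughly 0.12 for L220.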

3. SHUFFLING SCHEMES 

Our purpose in this paper is to study the impact of 
environmental dependence on the HOD modeling of real- 
space and redshift-space clustering statistics. Besides the 
six galaxy samples from the SAM, for comparison we also 
need galaxy samples with the environmental dependence
eliminated. Following Croton et al. (2007), we construct
such control samples from the original SAM catalog by 
shuffling galaxy contents in halos of similar masses. We 
produce three sets of control samples based on three shuf- 
fling schemes described below. 

We first group all the FOF dark matter halos at $z = 0$
with $M_{\rm vir}$ larger than $5.5 \times 10^{10}\,h^{-1}M_\odot$ in the catalog into
different mass bins of width $\Delta\log[M_{\rm vir}/(h^{-1}M_\odot)] = 0.1$.
Then we record the positions and velocities of
all the satellites relative to their affiliated central galaxies, whose
positions and velocities are set to those of their host halos
in the SAM. Finally, we redistribute the galaxies within
individual halo mass bins. The three sets of our control
samples (hereafter CTL1, CTL2 and CTL3, respectively)
differ in the way the galaxies are redistributed.

For CTL1, we follow the scheme of Croton et al. (2007)
to keep the original configuration of galaxies inside each
halo intact and move the galaxy content to its new host
as a whole. In this way, the one-halo term contribution
to the galaxy clustering statistics is almost unchanged.
To compensate for the finite width of the mass bins in
the shuffling, we scale the recorded relative position and
velocity of each galaxy by $(M_{\rm new}/M_{\rm old})^{1/3}$ when we
redistribute the galaxies in the original halo of mass $M_{\rm old}$
to the new host halo of mass $M_{\rm new}$. Because the virial
radius scales as $M^{1/3}$, this refinement
ensures that the positions of the shuffled galaxy content are
regulated by the virial radius of the new host halo.
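
The CTL1 scheme described above can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions, not the code used for the paper: halos are represented as plain dictionaries, only satellite positions are handled (velocities would be rescaled in the same way), and the exchange within a mass bin is implemented as a random permutation. The names `mass_bin` and `shuffle_ctl1` are hypothetical.

```python
import math
import random
from collections import defaultdict


def mass_bin(mass, bin_width=0.1):
    """Index of the logarithmic mass bin of width `bin_width` dex."""
    return math.floor(math.log10(mass) / bin_width)


def shuffle_ctl1(halos, bin_width=0.1, seed=0):
    """CTL1-like shuffle: move each halo's satellite content, as a whole,
    to another halo in the same mass bin.

    halos: list of dicts with "mass" and "sat_offsets", where sat_offsets
    is a list of (dx, dy, dz) positions relative to the central galaxy.
    Returns a list holding the new satellite offsets of every host halo.
    """
    rng = random.Random(seed)
    bins = defaultdict(list)
    for index, halo in enumerate(halos):
        bins[mass_bin(halo["mass"], bin_width)].append(index)

    new_offsets = [None] * len(halos)
    for hosts in bins.values():
        donors = hosts[:]
        rng.shuffle(donors)  # assumption: random permutation within the bin
        for host, donor in zip(hosts, donors):
            # Rescale by (M_new / M_old)^(1/3) to follow the new virial radius.
            scale = (halos[host]["mass"] / halos[donor]["mass"]) ** (1.0 / 3.0)
            new_offsets[host] = [
                tuple(scale * x for x in offset)
                for offset in halos[donor]["sat_offsets"]
            ]
    return new_offsets
```

Because each donor's satellite configuration is transplanted intact, intra-halo pair separations change only through the mild mass rescaling, which is why the one-halo term is nearly preserved.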

For CTL2, we collect the distance $r$ to the halo center
and the velocity $v$ relative to the halo center for all the
satellite galaxies that belong to halos in the same mass
bin. Then a pair of $r$ and $v$ is randomly drawn from
the sets and assigned to a galaxy. This galaxy is put into
a randomly selected halo in that mass bin with random
orientations for both $r$ and $v$ [with the $(M_{\rm new}/M_{\rm old})^{1/3}$
scaling applied]. Central galaxies are randomly
assigned to halos of the same mass bin. This shuffling
procedure assumes a mean radial galaxy number density
profile for all halos in the same mass bin and completely
eliminates any environmental features in the galaxy distribution
inside halos, including the alignment and segregation
of satellites, the non-spherical shape of halos, the
infall pattern of satellite velocity distribution, and any
correlation between central and satellite galaxies (e.g.,
in luminosity). CTL2 allows us to infer the largest
effect that environment may have on galaxy clustering
statistics for the given SAM.
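
The satellite part of the CTL2 scheme, for a single mass bin, might look like the following Python sketch. Several details are our assumptions rather than statements from the text: the (r, v) pairs are permuted among satellites without replacement, host halos are drawn uniformly from the bin, and $r$ and $v$ are treated as magnitudes that receive independent isotropic directions. The function names are hypothetical.

```python
import math
import random


def random_direction(rng):
    """Unit vector drawn uniformly on the sphere."""
    cos_theta = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    sin_theta = math.sqrt(1.0 - cos_theta * cos_theta)
    return (sin_theta * math.cos(phi), sin_theta * math.sin(phi), cos_theta)


def shuffle_ctl2_bin(host_masses, satellites, seed=0):
    """CTL2-like shuffle of the satellites in one halo mass bin.

    host_masses: masses of the halos in the bin.
    satellites: list of (galaxy_id, r, v, m_old), with the distance and
    speed relative to the halo center and the mass of the original host.
    Returns a list of (galaxy_id, host_index, position, velocity).
    """
    rng = random.Random(seed)
    ids = [sat[0] for sat in satellites]
    pool = [sat[1:] for sat in satellites]
    rng.shuffle(pool)  # assumption: (r, v) pairs drawn without replacement

    shuffled = []
    for galaxy_id, (r, v, m_old) in zip(ids, pool):
        host = rng.randrange(len(host_masses))  # random host in the bin
        scale = (host_masses[host] / m_old) ** (1.0 / 3.0)
        n_r = random_direction(rng)  # independent orientations for r and v
        n_v = random_direction(rng)
        position = tuple(scale * r * c for c in n_r)
        velocity = tuple(scale * v * c for c in n_v)
        shuffled.append((galaxy_id, host, position, velocity))
    return shuffled
```

Drawing hosts independently for each satellite is what makes the occupation numbers in a mass bin approach a Poisson distribution in this sketch.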

In addition to CTL1 and CTL2, we construct another 
set of samples (CTL3) by isotropizing satellites inside 
their own halo without shuffling contents between differ- 
ent halos. In this way, the radial distribution of galaxies 
in each individual halo is conserved, but the statistical 
angular distribution loses the anisotropy. CTL3 allows us 
to isolate the effect of assuming spherical symmetry for 
the satellite distribution in modeling 2PCFs. Although 
CTL2 also isotropizes the satellite distribution inside ha- 
los, it effectively uses a radial distribution averaged over 
halos of similar masses. Therefore, comparing CTL3 and 
CTL2 would show the effect of the scatter in the distri- 
butions of satellites in halos of similar masses. 
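
The isotropization step of CTL3 can be illustrated with the short Python sketch below. It keeps the radius of every satellite and assigns a new direction drawn uniformly on the sphere. The function name and the data layout are our own choices, and only positions are treated here.

```python
import math
import random


def isotropize_halo(sat_offsets, rng):
    """CTL3-like step: keep each satellite's radius, randomize its direction.

    sat_offsets: list of (dx, dy, dz) positions relative to the halo center.
    rng: a random.Random instance.
    """
    new_offsets = []
    for dx, dy, dz in sat_offsets:
        radius = math.sqrt(dx * dx + dy * dy + dz * dz)
        cos_theta = rng.uniform(-1.0, 1.0)  # uniform in cos(theta)
        phi = rng.uniform(0.0, 2.0 * math.pi)
        sin_theta = math.sqrt(1.0 - cos_theta * cos_theta)
        new_offsets.append((
            radius * sin_theta * math.cos(phi),
            radius * sin_theta * math.sin(phi),
            radius * cos_theta,
        ))
    return new_offsets
```

Since the radii are untouched, central-satellite separations in each halo are exactly preserved, while satellite-satellite separations change.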

For each of the CTL1, CTL2, and CTL3 shuffling
schemes, we create 10 different galaxy catalogs by varying
the random seed. We extract the six control luminosity-threshold
samples from each shuffled catalog, corresponding
to the L190, L200, L207, L210, L217 and
L220 samples of the SAM. To prevent numerical effects
from mixing with the physical effects we aim to measure,
we have performed tests by reducing the size of the halo mass
bins or leaving the several most massive bins unshuffled, and
find that our choice of bin size does not introduce
any noticeable numerical effect.

4. ENVIRONMENTAL EFFECT ON REAL-SPACE 2PCFS 

We start by comparing the real-space 2PCFs of the
original SAM samples and the shuffled samples. The 
2PCFs essentially describe the pair count as a function 
of pair separation. On small (large) scales, galaxy pairs 
are dominated by one-halo (two-halo) pairs, i.e., intra- 
halo (inter-halo) pairs. 
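
To make the link between pair counts and the 2PCF concrete, the Python sketch below evaluates the simplest pair-count estimator, $\xi = DD/RR - 1$, in a periodic box. Here the expected random pair count RR is computed analytically from the shell volume instead of from a random catalog; this is a simplification we adopt for illustration and not the measurement method of the paper. It is a brute-force O(N^2) loop, valid only for separations smaller than half the box size, and the function names are hypothetical.

```python
import math


def periodic_separation(p, q, box):
    """Distance between two points in a periodic cubic box of side `box`."""
    d2 = 0.0
    for a, b in zip(p, q):
        d = abs(a - b)
        d = min(d, box - d)  # minimum-image convention
        d2 += d * d
    return math.sqrt(d2)


def xi_periodic(points, edges, box):
    """Natural estimator xi = DD/RR - 1 with an analytic RR.

    points: list of (x, y, z); edges: increasing bin edges in separation.
    """
    n = len(points)
    n_bins = len(edges) - 1
    dd = [0] * n_bins
    for i in range(n):
        for j in range(i + 1, n):
            r = periodic_separation(points[i], points[j], box)
            for k in range(n_bins):
                if edges[k] <= r < edges[k + 1]:
                    dd[k] += 1
                    break

    total_pairs = n * (n - 1) / 2.0
    xi = []
    for k in range(n_bins):
        shell = 4.0 / 3.0 * math.pi * (edges[k + 1] ** 3 - edges[k] ** 3)
        expected = total_pairs * shell / box ** 3  # pairs for a uniform field
        xi.append(dd[k] / expected - 1.0)
    return xi
```

For catalogs of the size used here a tree- or grid-based pair counter would be needed in practice; the estimator itself is unchanged.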

Our results on the real-space 2PCFs are shown in Figure 1.
On small scales ($< 2\,h^{-1}$ Mpc), where the one-halo
term dominates, 2PCFs from CTL1, CTL2 and CTL3
behave differently. In CTL1, galaxy contents inside halos
as a whole are exchanged among halos of similar mass, so
we do not expect any appreciable change in the one-halo
regime of the 2PCFs. Thus, the small-scale clustering in
CTL1 remains almost the same as that of the original
sample, as seen in Figure 1. The slight differences seen in
the plot are a result of the finite mass bin width. CTL3 makes
the satellite distribution inside halos isotropic, which on average
enlarges the separations of intra-halo galaxy pairs.
So on small scales, 2PCFs of CTL3 are always smaller
than those of SAM, with a suppression of around 10%.
In CTL2, not only is the angular distribution of satellite
galaxies inside halos isotropized, but the radial galaxy
number density profile is also averaged within the same mass
bin, which completely erases the memories of galaxies
about their environments. Figure 1 shows that galaxies
in CTL2 exhibit a suppression of up to ~10% for the L190
sample and that the suppression becomes weaker for samples
with higher threshold luminosity (e.g., sample L210 in
Fig. 1). The 2PCF for the L217 CTL2 sample (Fig. 1)
is too noisy to tell the trend, but it is likely to still be a
suppression (see below).

Fig. 1. The comparison of 2PCFs between SAM and shuffled
samples for three different luminosity thresholds. The lower part
of each panel gives the ratio of the 2PCFs of the shuffled and
the original SAM samples. Solid lines are the 2PCFs for SAM
samples, while dashed, dot-dashed, and dotted lines are those for
the CTL1, the CTL2 and the CTL3 shuffled samples, respectively.
See the text. The (small) error bars reflect the scatter from the 10
realizations for each shuffled sample.

On large scales, where the two-halo term dominates,
2PCFs from CTL3 stay the same as those from the original
SAM samples, since the CTL3 scheme only shuffles
galaxies within each halo. It is not surprising that the
large-scale 2PCFs of CTL1 and CTL2 are almost identical,
given that they both shuffle galaxy contents among halos of similar mass.
The difference between the 2PCF in the CTL1/CTL2
sample and the original SAM sample shows a steady
trend with the threshold luminosity. For the faint sample
($M_r < -19$), the environmental dependence of the
galaxy population and the assembly bias lead to a ~10%
suppression in the 2PCF after shuffling (Fig. 1). For the
intermediate sample ($M_r < -21$), the difference between
shuffled and SAM samples is reduced to ~2% (Fig. 1).
For the bright $M_r < -21.7$ sample, the 2PCFs of the shuffled
samples become ~3% larger than those of the SAM sample
(Fig. 1). To see the trend more clearly, we show in Figure 2
the ratio $\xi/\xi_{\rm SAM}$ of large-scale 2PCFs of the shuffled
(CTL1 or CTL2) and the SAM samples as a function
of the magnitude limit. The ratio $\xi/\xi_{\rm SAM}$ is calculated
by averaging the measurements from the 10 realizations of each shuffled
sample on scales of 5-25 $h^{-1}$ Mpc. We note that the trend
of the ratio with the threshold luminosity is the same as
in Figure 2 of Croton et al. (2007), although we use a
different indicator for the large scale difference.

Fig. 2. The ratio of the 2PCF of the shuffled sample to that of the
SAM sample as a function of the magnitude limit of the sample, averaged
over scales of 5-25 $h^{-1}$ Mpc. In the SAM catalog used in this paper,
$L_*$ corresponds to $M_r \sim -20.7$. Since the ratio is nearly constant
on those scales, we do not show the small error bars.

For a better understanding of the change of clustering
strength in the shuffled samples with respect to the
original samples, we decompose the galaxy 2PCFs into
five components according to the source of galaxy pairs
and examine them individually. The five components
are denoted as 1h-cen-sat, 1h-sat-sat, 2h-cen-cen,
2h-cen-sat, and 2h-sat-sat, where 1h and 2h refer to
one-halo and two-halo pairs and cen and sat indicate the
nature (central or satellite galaxy) of each member of the pair.
That is, we have a central galaxy in a halo paired
with satellites in the same halo (1h-cen-sat), satellite
galaxy pairs inside halos (1h-sat-sat), a central galaxy in
one halo paired with a central galaxy in a different halo
(2h-cen-cen), a central galaxy in one halo paired with
satellites in a different halo (2h-cen-sat), and satellites
in one halo paired with satellites in a different halo
(2h-sat-sat). A detailed description of how we separate
these components can be found in the Appendix.
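
The classification of a galaxy pair into one of the five components depends only on whether the two galaxies share a host halo and on how many of them are centrals. A minimal Python sketch follows; the dictionary keys `halo_id` and `is_central` are hypothetical names for whatever the catalog provides, and this is not the decomposition code of the Appendix.

```python
def pair_component(gal_a, gal_b):
    """Return the 2PCF component label for a pair of galaxies.

    Each galaxy is a dict with "halo_id" and a boolean "is_central".
    """
    same_halo = gal_a["halo_id"] == gal_b["halo_id"]
    n_central = int(gal_a["is_central"]) + int(gal_b["is_central"])
    if same_halo:
        if n_central == 2:
            # A halo hosts a single central, so this pair cannot occur.
            raise ValueError("two centrals in the same halo")
        return "1h-cen-sat" if n_central == 1 else "1h-sat-sat"
    return {2: "2h-cen-cen", 1: "2h-cen-sat", 0: "2h-sat-sat"}[n_central]
```

Accumulating pair counts separately for each label and normalizing all of them by the same random pair counts yields component 2PCFs whose pair counts add up to those of the full sample.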

Figure 3 shows the five 2PCF components of the original
and shuffled samples for L190, L210 and L217 (left,
middle and right columns for CTL1, CTL2 and CTL3,
respectively). Since the spatial distribution of satellites
inside halos is conserved in the CTL1 sample [except for
the $(M_{\rm new}/M_{\rm old})^{1/3}$ scaling], there is almost no change
in the one-halo components for this sample with respect
to the original sample. In the shuffling schemes of
CTL2 and CTL3 samples, satellites inside halos are angularly
redistributed from a non-spherical distribution
to an isotropic distribution. The redistribution in either
CTL2 or CTL3 does not change the separations of
one-halo central-satellite galaxy pairs, so the 1h-cen-sat
component does not change after shuffling, as seen in Figure 3.
However, the isotropization statistically increases
the separations of one-halo satellite-satellite pairs and
thus dilutes the 1h-sat-sat clustering signal. This leads
to a suppression of the 2PCF on small scales with respect
to the original sample (e.g., ~10% for L190). CTL2
samples show a smaller suppression in the 1h-sat-sat
component than CTL3 samples. There may be two reasons
for the difference. First, CTL2 effectively uses a
mean radial distribution profile of satellites in halos of a
given mass, while CTL3 uses the radial distribution in
each individual halo. Because of the scatter in the radial
profiles at a given halo mass, the distributions of one-halo
satellite-satellite pair separations obtained
from the mean and individual profiles are not identical. Second, CTL2
ensures that the numbers of satellites inside halos of a
given mass follow the Poisson distribution, while CTL3
follows the distribution in the SAM sample, which can be
slightly sub-Poisson in the low occupation regime (e.g.,
Zheng et al. 2005).

Fig. 3. Comparison of 2PCFs between SAM and shuffled $M_r < -19$, $-21$ and $-21.7$ samples by decomposing the 2PCF into five
separate components. Solid and dotted lines are the overall 2PCF for SAM and shuffled samples, respectively. The shuffled sample in
the left (middle, right) column is from the CTL1 (CTL2, CTL3) shuffling scheme (see the text). The five components correspond to
contributions from one-halo central-satellite galaxy pairs, one-halo satellite-satellite galaxy pairs, two-halo central-central galaxy pairs,
two-halo central-satellite galaxy pairs, and two-halo satellite-satellite galaxy pairs, respectively. See the text and the Appendix for more
details.

For the shuffled samples CTL1, CTL2 and CTL3,
all two-halo components (except 2h-cen-cen) show enhancements
on scales less than 1 $h^{-1}$ Mpc with respect
to the original ones. This is mainly caused by the non-spherical
distribution of satellite galaxies inside halos in
the SAM sample. The shuffling procedure can cause the
satellite populations of two neighboring (non-spherical)
halos to become spatially close or even to overlap to
some extent. Therefore, in the shuffled samples the probability
of finding close inter-halo galaxy pairs that involve
satellites increases. However, such small-scale enhancements
in the two-halo components only occur on scales
where the one-halo term of the 2PCF dominates, so
they are of no interest in our analysis.

On large scales, where the two-halo term dominates,
the two-halo components for CTL3 do not change
since it only shuffles galaxies within halos, while every
two-halo component changes its amplitude after shuffling
with CTL1 and CTL2. This should
be a manifestation of the environmental dependence of
the halo clustering and that of the galaxy content inside
halos. Let us first consider the 2h-cen-cen component.
The effect of shuffling central galaxies is equivalent to
that of shuffling halos. If the galaxy sample were a halo-mass-threshold
sample, shuffling would not change the
large scale clustering of central galaxies, as the host halo
population would remain the same after shuffling. However,
the sample we consider is defined by a threshold in luminosity,
and it is not a halo-mass-threshold sample because
of the scatter between halo mass and central galaxy
luminosity. At a fixed mass, older halos tend to host
more luminous central galaxies, and the mean central
galaxy luminosity is an increasing function of halo mass
(Zhu et al. 2006). We thus expect that, at a given luminosity,
a central galaxy can reside in a low mass older halo
or in a younger halo of higher mass. That is, for low mass
halos, only a fraction of them (some older ones) can host
the galaxies in our luminosity-threshold sample. Since
the shuffling is among halos of the same mass, some central
galaxies of the sample in these low mass older halos
are moved to younger halos in the same mass bin after
shuffling. For the L190 samples, these low mass halos are
in the regime where the clustering of younger halos is
weaker, so we see a decrease in the 2h-cen-cen component
of the 2PCF after shuffling [Fig. 3(1) and Fig. 3(2)].
However, halos at the low mass end in the L217 samples
are in the regime where the clustering of older halos is
weaker, leading to an increase in the 2h-cen-cen component
after shuffling [Fig. 3(7) and Fig. 3(8)]. For the L210
samples, the low mass halos are in the regime where the
age dependence of halo clustering almost disappears, and
as a consequence, the 2h-cen-cen component does not
change much after shuffling [Fig. 3(4) and Fig. 3(5)].

Unlike the 2h-cen-cen component, for which the effect
of shuffling is determined by the halos near the
low-mass end for the given sample, the 2h-cen-sat
and 2h-sat-sat components are influenced by all halos
above the low-mass end. Zhu et al. (2006) find that,
in general, at a fixed halo mass, there are fewer satellite
galaxies in older halos. Combining this finding with the
age dependence of halo clustering (i.e., older halos being
more strongly clustered in the mass range appropriate for
our sample), one would infer that shuffling would increase
the amplitudes of the 2h-cen-sat and 2h-sat-sat components,
since the overall effect of shuffling is to homogenize
satellite populations among halos of different ages
(i.e., to increase/decrease the number of satellites in
older/younger halos). However, this naive expectation
contradicts what is seen in Figure 3. What, then,
is the reason for the suppression in the 2h-cen-sat and
2h-sat-sat components? The answer lies in the richness
dependence of halo clustering. Here, the term "richness"
refers to the subhalo/substructure/satellite abundance
in a halo. Gao & White (2007) show that, in the mass
range relevant here, halos with more substructures are
always more strongly clustered. Since substructures are
the natural dwellings of satellite galaxies, we expect that,
at a fixed mass, halos that have more satellites are more
strongly clustered. The effect of shuffling is to move some
satellites in strongly clustered halos to weakly clustered
halos and thus lower the amplitude of the 2h-cen-sat
and 2h-sat-sat components of the 2PCF.

Fig. 4. Contributions to the large scale bias factor from different
pair components as a function of the magnitude limit of the galaxy
sample. The contribution of component i to the large scale 2PCFs,
normalized by the matter 2PCF, is shown for 2h-cen-cen (black),
2h-cen-sat (green), and 2h-sat-sat (magenta). The blue curve is
the overall squared bias factor. Solid and dashed lines are for the
original SAM and the CTL1 shuffled samples, respectively. See the
text for more details.

The above explanation of the two-halo term change
still leaves one question. According to the age
dependence of halo clustering (older halos are more
strongly clustered) and the anti-correlation between age 
and subhalo abundance (older halos have fewer subha- 
los; IGao et al.ll200l IZhu et al.l[2006h . one would expect 
that halos with fewer satellites are more strongly clustered,
in sharp contrast with what is found in simulations
(e.g., Gao & White 2007). The solution to the apparent
contradiction lies in the scatter in the anti-correlation between
age and richness and the joint dependence of halo
clustering on age and richness (Zu et al., in preparation).

Figure 4 summarizes the contributions of two-halo
components to the large-scale 2PCFs and the changes
caused by CTL1/CTL2 shuffling as a function of threshold
luminosity. We plot the contributions from different
two-halo components to the large-scale bias factor
(squared) for both the original SAM samples (solid) and
CTL1 samples (dashed). Each component contribution
to the squared bias factor is computed by averaging the
ratio of the corresponding two-halo 2PCF component
(2h-cen-cen, 2h-cen-sat, or 2h-sat-sat) to the matter
2PCF on scales of 5-15 $h^{-1}$ Mpc. For galaxy samples
with low threshold luminosity, the largest contribution
to the large-scale clustering comes from the 2h-cen-sat
component. The 2h-cen-cen component then takes over
for samples with threshold luminosity around L* and
becomes more and more dominant towards higher
luminosity. This trend can be understood by noticing that
the satellite fraction decreases with increasing threshold
luminosity (e.g., Zheng et al. 2007). The 2h-sat-sat
component always makes the smallest contribution to the
large-scale clustering.
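
As an illustration of the averaging step described above, the following is a minimal sketch, not the authors' code. The function name `bias_sq_contributions`, the radial grid, and the toy amplitudes are all hypothetical; only the recipe (average the component-to-matter 2PCF ratio over 5-15 $h^{-1}$ Mpc) comes from the text.

```python
import numpy as np

def bias_sq_contributions(r, xi_matter, xi_components, r_min=5.0, r_max=15.0):
    """Average each component-to-matter 2PCF ratio over r_min <= r <= r_max.

    r, xi_matter : 1-D arrays (separations in h^-1 Mpc, matter 2PCF)
    xi_components: dict mapping a component name to its 2PCF on the same grid
    """
    r = np.asarray(r, dtype=float)
    xi_matter = np.asarray(xi_matter, dtype=float)
    mask = (r >= r_min) & (r <= r_max)
    return {name: float(np.mean(np.asarray(xi, dtype=float)[mask] / xi_matter[mask]))
            for name, xi in xi_components.items()}

# Toy input: made-up amplitudes, NOT the measurements shown in Figure 4.
r = np.linspace(1.0, 20.0, 39)
xi_m = (r / 5.0) ** -1.8
toy = {"2h-cen-cen": 0.30 * xi_m,
       "2h-cen-sat": 0.45 * xi_m,
       "2h-sat-sat": 0.15 * xi_m}
contrib = bias_sq_contributions(r, xi_m, toy)
total_b2 = sum(contrib.values())   # sum of the two-halo contributions
```

Summing the three entries gives the total squared bias on these scales, provided the three components add up to the full two-halo 2PCF.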

Figure 4 shows that shuffling causes the 2h-cen-cen
component to be suppressed slightly for samples with low
luminosity thresholds and to be enhanced slightly for
samples with high luminosity thresholds, a trend that can be
explained by the age dependence of halo clustering as
discussed above. Here "low" and "high" are with respect to
L*. We note that the change in the 2h-cen-cen component
decreases again at the very high luminosity end (i.e.,
the L220 sample), similar to the trend seen in the
concentration dependence of massive halo clustering (see, e.g.,
Jing et al. 2007). The 2h-cen-sat component is always
suppressed after shuffling, which can be understood by
the richness dependence of halo clustering as mentioned
above. The change of the overall large-scale bias factor is
dominated by that of the 2h-cen-sat component at low luminosity
and very high luminosity. The change of the 2h-cen-cen
component plays a role in determining that of the overall
bias factor for samples with luminosity threshold larger
but not much larger than L*, and it nearly compensates for
the suppression caused by the change of the 2h-cen-sat
component, leading to little change (<2%) in the overall
bias factor (also see Fig. 2).

5. ENVIRONMENTAL EFFECT ON REDSHIFT-SPACE 
2PCFS 

While the 2PCFs in real space are isotropic, the 2PCFs
in redshift space are distorted by galaxy peculiar velocities
along the line of sight. On small scales, the random
virialized motions of galaxies in groups and clusters
stretch the redshift distribution of galaxies along the line
of sight, producing the so-called "finger-of-god" (FOG)
effect. On large scales, the coherent flows of galaxies due
to gravity squash the line-of-sight distribution of galaxies
(i.e., the Kaiser effect; Kaiser 1987).

In linear theory, the large-scale redshift-space distortion
measures a combination of $\Omega_m$ and the large-scale
galaxy bias factor $b_g$, namely $\Omega_m^{0.6}/b_g$. Given the
measured amplitude of the galaxy 2PCF, a constraint on $\Omega_m^{0.6}/b_g$
is equivalent to one on $\sigma_8\Omega_m^{0.6}$, where $\sigma_8$ is the rms
matter fluctuation on a scale of 8 $h^{-1}$ Mpc. To infer such a
constraint on large scales, the small-scale redshift distortion
is usually dealt with using simple models, such as the
exponential model (Cole et al. 1995). Tinker et al. (2006)
demonstrate that by taking advantage of the power of the
HOD to describe clustering in a fully non-linear manner,
one can consistently model the small-scale and the
large-scale clustering (also see Tinker 2007). Furthermore,
Tinker et al. (2006) show that the degeneracy in
$\Omega_m$ and $\sigma_8$ from large-scale clustering can be broken
by making use of the small- and intermediate-scale
clustering in redshift space. The HOD framework used in
Tinker et al. (2006) assumes no environmental effects on
halo clustering and galaxy content inside halos. In § 4
we have shown that in the SAM we use, the real-space
2PCFs may suffer a change of up to ~10% because of the
environmental effect. It is interesting to perform a similar
analysis in redshift space and discuss the implications for
inferring cosmological parameters from redshift
distortions.

In Figure 5 we show the redshift-space 2PCFs
$\xi(r_p, r_\pi)$ (footnote 7) measured from the SAM, CTL1, CTL2, and
CTL3 catalogs for the same three luminosity-threshold samples
as in Figure 1, where $r_p$ and $r_\pi$ are the perpendicular
and line-of-sight separations in redshift space. The overall
effect of the shuffling on the redshift-space 2PCFs is similar
to that seen in the real-space 2PCFs. On small scales,
where the FOG effect dominates, the clustering amplitudes
of CTL1 samples (green contours) are almost identical
to those of the SAM samples (black contours), since
shuffling does not change the one-halo term in CTL1. For
CTL2 samples (red contours), the FOG effect is slightly
suppressed and the small-scale clustering is weaker than
that of the SAM for L190, but nearly identical to and
stronger than that of the SAM for L210 and L217,
respectively. For CTL3 samples (blue contours), although they
exhibit a large difference in the real-space 2PCFs, the FOG
effect changes only a little, which indicates that the
difference caused by the galaxy angular distribution is partly
masked by the peculiar velocity field on small scales.

Fig. 5. Comparison of redshift-space correlation functions
between SAM samples and shuffled samples. Panels (a), (b), (c)
are for L190, L210, and L217 samples, respectively. In each panel,
three quadrants show the comparison between one shuffled sample
and the SAM, with black contours for the original SAM sample and
green, red, and blue contours for the CTL1, CTL2, and CTL3
shuffled samples, respectively. Contour levels are set as $2^n$, with $n$
from -5 to 2.

On large scales, similar to what is seen in the real-space
2PCFs, at a given large-scale separation $(r_p, r_\pi)$,
the redshift-space 2PCF of the CTL1 or CTL2 sample has a
lower amplitude than that of the SAM sample for L190,
an almost identical amplitude to that of the SAM sample
for L210, and a higher amplitude than that of the SAM
sample for L217. The 2PCF amplitudes do not change in the
CTL3 samples. As a whole, shuffling introduces changes similar to
those in the real-space 2PCFs, and these changes can be
understood following the interpretations in § 4.

To further quantify changes in the redshift-space
2PCFs, we calculate a few statistics derived from the
multipoles of the real-space and redshift-space 2PCFs,
which were originally proposed by Hamilton (1992).
These statistics are also the ones used in the studies of
Tinker et al. (2006) and Tinker (2007) for HOD modeling
of the redshift-space distortion.

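The multipole moments $\xi_l(r)$ are given by the coefficients
of the Legendre polynomial expansion of $\xi(r_p, r_\pi)$,

$$\xi_l(r) = \frac{2l+1}{2} \int_{-1}^{+1} \xi(r_p, r_\pi)\, P_l(\mu)\, d\mu, \qquad (1)$$

where $r = \sqrt{r_p^2 + r_\pi^2}$, $\mu = r_\pi/r$, and $P_l(\mu)$ is the $l$-th
order Legendre polynomial. Based on the multipoles of
$\xi(r_p, r_\pi)$, we calculate two statistics. The first one is the
ratio of the monopole $\xi_0(r)$ to the real-space 2PCF $\xi_R(r)$,

$$\xi_{0/R}(r) \equiv \frac{\xi_0(r)}{\xi_R(r)}. \qquad (2)$$

In linear theory, it is a function of $\beta = \Omega_m^{0.6}/b_g$,

$$\xi_{0/R}(r) = 1 + \frac{2}{3}\beta + \frac{1}{5}\beta^2. \qquad (3)$$

The second quantity, $Q_\xi(r)$, is related to the quadrupole
$\xi_2(r)$,

$$Q_\xi(r) \equiv \frac{\xi_2(r)}{\xi_0(r) - \bar{\xi}_0(r)}, \qquad (4)$$

where $\bar{\xi}_0(r)$ is the volume-averaged monopole,

$$\bar{\xi}_0(r) = \frac{3}{r^3} \int_0^r \xi_0(s)\, s^2\, ds. \qquad (5)$$

In linear theory, $Q_\xi(r)$ is also a function of $\beta$,

$$Q_\xi(r) = \frac{\frac{4}{3}\beta + \frac{4}{7}\beta^2}{1 + \frac{2}{3}\beta + \frac{1}{5}\beta^2}. \qquad (6)$$

7 We use the same symbol $\xi$ for both the real-space and redshift-space
2PCFs. Whenever this may cause confusion, we add a
subscript R for real-space quantities.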



Tinker et al. (2006) also introduce a quantity $r_{\xi/2}$, which
is the value of $r_\pi$ at which the redshift-space 2PCF at the
given $r_p$ decreases by a factor of 2 with respect to the
value of the 2PCF at $r_\pi = 0$. We also compute this quantity.
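
The multipole statistics defined above can be sketched numerically as follows. This is a minimal illustration, not the measurement code used in the paper. The helper names (`multipole`, `q_statistic`, `kaiser_toy`) are hypothetical, and the toy input is an assumed linear-theory model with a power-law real-space 2PCF (slope and amplitude chosen arbitrarily) that keeps only the monopole and quadrupole terms.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legval

def legendre_P(ell, mu):
    """Legendre polynomial P_ell(mu)."""
    coeffs = np.zeros(ell + 1)
    coeffs[ell] = 1.0
    return legval(mu, coeffs)

def multipole(xi_2d, r, ell, n_mu=32):
    """xi_l(r) = (2l+1)/2 * int_{-1}^{1} xi(r_p, r_pi) P_l(mu) dmu (Gauss-Legendre)."""
    mu, w = leggauss(n_mu)
    r_pi = r * mu
    r_p = r * np.sqrt(1.0 - mu**2)
    return 0.5 * (2 * ell + 1) * np.sum(w * xi_2d(r_p, r_pi) * legendre_P(ell, mu))

def q_statistic(xi_2d, r, n_s=2000):
    """Q_xi(r) = xi_2(r) / [xi_0(r) - xi0_bar(r)], with xi0_bar from a trapezoidal rule."""
    s = np.linspace(0.0, r, n_s + 1)[1:]
    xi0 = np.array([multipole(xi_2d, si, 0) for si in s])
    # Integrand xi_0(s) s^2, assumed to vanish at s = 0 (true for slopes < 2).
    f = np.concatenate(([0.0], xi0 * s**2))
    s_full = np.concatenate(([0.0], s))
    integral = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(s_full))
    xi0_bar = 3.0 * integral / r**3
    return multipole(xi_2d, r, 2) / (xi0[-1] - xi0_bar)

def xi0_over_xiR_linear(beta):
    return 1.0 + 2.0 * beta / 3.0 + beta**2 / 5.0

def q_linear(beta):
    return (4.0 * beta / 3.0 + 4.0 * beta**2 / 7.0) / xi0_over_xiR_linear(beta)

def kaiser_toy(beta, gamma=1.8, r0=5.0):
    """ASSUMED toy model: linear theory with xi_R = (r/r0)^-gamma, l = 0 and 2 only."""
    def xi_R(r):
        return (r / r0) ** (-gamma)
    def xi_2d(r_p, r_pi):
        s = np.hypot(r_p, r_pi)
        mu = r_pi / s
        xi0 = xi0_over_xiR_linear(beta) * xi_R(s)
        # For a power law, xi_R - xi_R_bar = -gamma / (3 - gamma) * xi_R.
        xi2 = (4.0 * beta / 3.0 + 4.0 * beta**2 / 7.0) * (-gamma / (3.0 - gamma)) * xi_R(s)
        return xi0 + xi2 * legendre_P(2, mu)
    return xi_R, xi_2d

beta = 0.4
xi_R, xi_2d = kaiser_toy(beta)
ratio_num = multipole(xi_2d, 10.0, 0) / xi_R(10.0)   # compare with xi0_over_xiR_linear(beta)
q_num = q_statistic(xi_2d, 10.0)                      # compare with q_linear(beta)
```

For measured data, `xi_2d` would be replaced by an interpolation of the binned $\xi(r_p, r_\pi)$, and the integration accuracy would then be limited by the binning rather than by the quadrature.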

Figure 6 plots $\xi_{0/R}$ and $Q_\xi$ as functions of $r$ and $r_{\xi/2}$
as a function of $r_p$ for the L190 sample, using both linear
and logarithmic axes to highlight large and small scales
separately. According to the above results, this sample,
among the six luminosity-threshold samples, is expected
to show the largest environmental effect (we also check
the luminosity-threshold sample with magnitude limit
$M_r = -18.0$ and find that the large-scale suppression
is still at the 10% level, as it is in L190).

Compared to the SAM sample, the monopole term
$\xi_{0/R}$ in either CTL1 or CTL2 is only ~2% higher on
large scales, whereas it stays the same in CTL3. On
small scales (0.1-1 $h^{-1}$ Mpc), the difference is at a level
of 5% in CTL1 and CTL2. Only on extremely small scales
(< 0.1 $h^{-1}$ Mpc) does $\xi_{0/R}$ of the CTL2 sample show a
10% drop, which is not important since the error bars are
large and these scales are likely to be excluded in HOD
modeling. In the CTL3 sample, since the suppression of
the spherically averaged redshift-space 2PCFs and that
of the real-space 2PCFs cancel each other on small
scales, $\xi_{0/R}$ stays at the same level as in the SAM sample.

For the quadrupole term $Q_\xi$, the difference between
the results of the original and shuffled samples is well
within 5.5% in CTL1/CTL2 for most scales. Note that
the difference extends all the way to the largest scales
in CTL3 at a 0.5% level, which means that the
non-linearity still affects clustering behaviors on linear scales
in redshift space. Also note that the large fractional
differences around 10 $h^{-1}$ Mpc arise simply because $Q_\xi$ is
crossing zero. The fractional changes in $\xi_{0/R}$ and $Q_\xi$
caused by shuffling are much smaller than those in the
real-space 2PCFs, which are at a level of 10%. For the
quantity $r_{\xi/2}$, the global enhancement in CTL3 is caused
only by the angular isotropizing of galaxies, which
disrupts the original compact configuration of SAM halos.
This makes it harder for the redshift-space 2PCFs to decrease
to half the value of $\xi|_{r_\pi=0}$ at a given $r_p$, especially at
$r_p < 0.2\,h^{-1}$ Mpc, where the enhancement becomes much more
prominent. In CTL2, $r_{\xi/2}$ shows a behavior similar to that in
CTL3 at small $r_p$ for the same reason, and then becomes
smaller than that in CTL3 at $r_p > 0.2\,h^{-1}$ Mpc, while
$r_{\xi/2}$ from the CTL1 shuffled sample is consistent with
that from the original sample within the error bars.
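
The definition of $r_{\xi/2}$ can be made concrete with a short sketch. The function name `r_xi_half`, the grid in $r_\pi$, and the exponential toy profile are assumptions introduced here for illustration; the sketch also assumes $\xi(r_p, 0) > 0$ and takes the first crossing of the half value.

```python
import numpy as np

def r_xi_half(xi_2d, r_p, r_pi_max=40.0, n=4001):
    """r_pi at which xi(r_p, r_pi) first drops to half of xi(r_p, 0).

    Uses linear interpolation between neighbouring grid points and returns
    NaN if the half value is not reached within r_pi_max.
    """
    r_pi = np.linspace(0.0, r_pi_max, n)
    xi = xi_2d(r_p, r_pi)
    target = 0.5 * xi[0]
    below = np.nonzero(xi <= target)[0]
    if below.size == 0:
        return np.nan
    k = below[0]
    t = (xi[k - 1] - target) / (xi[k - 1] - xi[k])
    return r_pi[k - 1] + t * (r_pi[k] - r_pi[k - 1])

# Toy profile (hypothetical): exponential decline along the line of sight
# with scale length 3 h^-1 Mpc, for which the half-value point is 3 ln 2.
toy_xi = lambda r_p, r_pi: (1.0 + r_p) ** -1.5 * np.exp(-r_pi / 3.0)
r_half = r_xi_half(toy_xi, r_p=0.5)
```

A more extended line-of-sight profile, such as the one produced by isotropizing satellites, pushes the half-value point to larger $r_\pi$.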

How would the above changes in the redshift distortion
statistics induced by the environmental effect affect the
inference of cosmological parameters from HOD modeling?
For a complete answer to this question, one needs to
perform the analysis presented in Tinker (2007) with the
2PCF measurements in the original and shuffled samples
and compare the inferred cosmological parameters.
However, even without the full analysis, we
can still estimate the likely magnitude by using the linear
theory results [equations (3) and (6)]. For the L190
sample presented in Figure 6, a 2% increase in the large-scale
$\xi_{0/R}$ only leads to a ~6.5% increase in the inferred
$\beta$. For $Q_\xi$, shuffling gives rise to a 5.5% increase on
large scales, which also translates to a ~6.5% increase in
$\beta$. For $r_{\xi/2}$, it is not straightforward to see the
consequence. Based on Figure 16 of Tinker et al. (2006), it is
likely that the difference in $r_{\xi/2}$ on large scales between
the SAM and shuffled samples can at most lead to a ~5%
uncertainty in constraining $\sigma_8$. We note that the effects
of the shuffling on both the real-space and redshift-space
clustering statistics are likely from the same cause, in that
the ~12% decrease in the real-space 2PCF $\xi_R(r)$ leads to a
6% decrease in the galaxy bias $b_g$, which in turn corresponds
to a ~6.5% increase in $\beta$, about the number we infer
from the redshift-space clustering statistics. Since the
shuffling-induced changes in the real-space 2PCFs for the other,
brighter samples are smaller, the environmental effect on
$\beta$ is expected to be smaller for them. Therefore, in the
SAM galaxy catalog we use, neglecting any environmental
dependence of halo clustering and galaxy formation
is likely to cause a less than 7% systematic uncertainty
in constraining $\sigma_8\Omega_m^{0.6}$.

Fig. 6. Comparison of redshift-space clustering statistics between the SAM and shuffled samples for the L190 galaxy sample. Top
panels are for $\xi_{0/R}$, the ratio of the monopole of the redshift-space 2PCF to the real-space 2PCF; middle panels are for $Q_\xi$, which is related
to the quadrupole of the redshift-space 2PCF; bottom panels are for $r_{\xi/2}$, which is the value of $r_\pi$ at which the redshift-space 2PCF at a
given $r_p$ decreases by a factor of 2 relative to its value at $r_\pi = 0$. In each panel, black solid curves stand for the SAM sample, while the green
dashed, the red dot-dashed, and the blue dotted curves indicate the CTL1, CTL2, and CTL3 shuffled samples, respectively. The horizontal
axes are shown in linear (logarithmic) scales in the left (right) panels to highlight the large (small) scale behavior. Error bars are plotted
only for the SAM sample to avoid crowding; those for the shuffled samples are comparable. The abnormal error bars around 10 $h^{-1}$ Mpc
in the middle panels [panels (c) and (d)] arise because $Q_\xi$ approaches zero there.
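
The order of magnitude of these numbers can be checked by inverting the linear-theory relations, equations (3) and (6). The sketch below is only an estimate under stated assumptions: the fiducial values $\Omega_m = 0.25$ and $b_g = 1$ are assumed here and are not quoted in the text, and the helper names are hypothetical. The resulting percentages depend on the assumed $\beta$, so they need not match the quoted ~6.5% exactly.

```python
import math

def xi0_over_xiR(beta):
    """Linear-theory monopole ratio, equation (3)."""
    return 1.0 + 2.0 * beta / 3.0 + beta**2 / 5.0

def q_xi(beta):
    """Linear-theory quadrupole statistic, equation (6)."""
    return (4.0 * beta / 3.0 + 4.0 * beta**2 / 7.0) / xi0_over_xiR(beta)

def invert(func, target, lo=0.0, hi=3.0):
    """Solve func(beta) = target by bisection; func increases with beta on [lo, hi]."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if func(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# ASSUMED fiducial values (not given in the text): Omega_m = 0.25, b_g = 1.
beta_fid = 0.25**0.6 / 1.0

# Fractional change in beta implied by a 2% increase in xi_{0/R}.
dbeta_from_xi = invert(xi0_over_xiR, 1.02 * xi0_over_xiR(beta_fid)) / beta_fid - 1.0
# Fractional change in beta implied by a 5.5% increase in Q_xi.
dbeta_from_q = invert(q_xi, 1.055 * q_xi(beta_fid)) / beta_fid - 1.0
# A 12% drop in xi_R lowers b_g by sqrt(0.88); beta scales as 1 / b_g.
dbeta_from_bias = 1.0 / math.sqrt(1.0 - 0.12) - 1.0
```

With these assumed inputs all three routes give a change in $\beta$ of a few percent, which is the sense in which the real-space and redshift-space results point to the same cause.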

6. SUMMARY AND DISCUSSION 

In this work, we investigate the effect of environmental
dependence of halo clustering and galaxy formation on
real-space and redshift-space clustering of galaxies. Our
study makes use of the galaxy catalog from the SAM of
De Lucia & Blaizot (2007), which is based on the Millennium
Simulation (Springel 2005). The inherent dependence
of galaxy properties on environment in the SAM
catalog is eliminated by shuffling galaxies among halos
of similar mass.

The real-space 2PCFs in the original sample and those 
in the shuffled samples have a difference at the level 
of 10% with some dependencies on scales for samples 
with low threshold luminosities. The difference becomes 
much smaller for samples with threshold luminosities approaching
or exceeding L*. We decompose the 2PCFs
into five components by accounting for the nature of 
galaxy pairs (e.g., one-halo or two-halo, central galaxies 
or satellites) and study the effect of environment on each 
of them. In general, on large scales, the changes in the 
2h-cen-cen component of the 2PCF caused by shuffling 
can be well understood by noticing the dependence of 
halo bias and central galaxy luminosity on halo formation 
time, while those in the 2h-cen-sat and 2h-sat-sat 
components are determined by the richness (substruc- 
ture) dependence of halo bias. The 2h-cen-sat com- 
ponent appears to dominate the change in the overall 
2PCF for samples with low or very high threshold lumi- 
nosity, while the change in the 2h-cen-cen component 
nearly compensates that in the 2h-cen-sat component 
for threshold luminosity L*, where the amplitude of the 
environmental effect is small. These results imply that we
could use high-resolution N-body simulations to accurately
model the 2PCFs by associating satellites with
substructures identified in halos and neglecting the
environmental dependence of central galaxies at fixed halo mass.
On small scales, the assumption of spherical symmetry in 
galaxy distribution may lead to an uncertainty as large 
as 10% in 2PCFs, but this effect could be absorbed into 
a free parameter describing the halo concentration. 

The effects of environmental dependence on redshift-space
2PCFs are similar to those seen in the real-space
2PCFs. On large scales, the effects can be attributed
solely to the change in the large-scale bias factor.
For inferring cosmological parameters ($\sigma_8$ and $\Omega_m$)
through HOD modeling of the redshift-space distortion
(Tinker et al. 2006; Tinker 2007), the systematic effect
caused by neglecting the environmental dependence of
halo clustering and galaxy formation is likely to be at
the level of <6.5% for the worst case (the $M_r < -19.0$ or
L > 0.2 L* sample) and can be much smaller for brighter
samples, especially for samples with threshold luminosities
near L*. The underlying assumption of this statement
is that the environmental effect on galaxy formation
in reality is as large as that seen in the SAM we use in
this paper.

Our results are based on one particular galaxy formation
model, i.e., the SAM of De Lucia & Blaizot (2007).
Although this model can reproduce many observed properties
of galaxies, it is not guaranteed to be absolutely correct.
In this model, the environmental effect on the formation
and evolution of galaxies is mostly linked to the
formation/merger history of dark matter halos. Compared
to observations, it overproduces faint red galaxies
(Croton et al. 2006). Although the model predicts
the correct trend of the color-dependent galaxy clustering,
it predicts too large a difference between the amplitudes
of the 2PCFs of blue and red galaxies (Springel
2005). By comparing to the galaxies in SDSS groups,
Weinmann et al. (2006) find that the SAM produces too
many faint satellites in massive halos and incorrect blue
fractions of central and satellite galaxies.

The discrepancies between observations and the SAM
suggest that the effect of environment on galaxy
formation and evolution may be exaggerated in this
particular model. Such discrepancies provide opportunities
for enhancing our understanding of galaxy formation
and evolution. There are also tests with void statistics
(Tinker et al. 2007), the environmental dependence of
group galaxies (Blanton & Berlind 2007), and the marked
galaxy correlation function (Skibba et al. 2006), which
show that the observed properties of galaxies are mainly
driven by host halo mass rather than the environment in
which halos form. Therefore, in reality, it is quite possible
that the environmental effect on modeling galaxy
clustering statistics is much smaller than what we obtain
in this paper, and that the systematic effect on cosmological
parameter constraints from HOD modeling is not
larger than a few percent, or even smaller.
APPENDIX

DECOMPOSITION OF THE TWO-POINT CORRELATION FUNCTION 

In the HOD framework, the 2PCF $\xi(r)$ is usually decomposed into two components (e.g., Zheng 2004),

$$\xi(r) = [1 + \xi_{1h}(r)] + \xi_{2h}(r), \qquad (A1)$$

where the one-halo term $\xi_{1h}(r)$ and the two-halo term $\xi_{2h}(r)$ represent contributions from intra-halo and inter-halo
pairs, respectively. To separate such components from measurements in a mock catalog, one only needs to weight the
total correlation function appropriately. That is, $1 + \xi(r)$ weighted by the fraction of intra-halo (inter-halo) pairs at a
separation $r$ gives $1 + \xi_{1h}(r)$ [$1 + \xi_{2h}(r)$]. However, the way to decompose $\xi(r)$ into more components on the basis of
pair counts, as we do in this paper (central/satellite, one-halo, two-halo pairs), is not immediately clear. In this
Appendix, we develop a method for such a decomposition. The method can be generalized to apply to real data: for
example, one is able to measure the two-point auto-correlation functions of red and blue galaxies and their two-point
cross-correlation functions in a single run with only one random catalog.

We first provide a general consideration on the component separation and then describe the decomposition used in
this paper in more detail.

General Consideration 

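Let us start from the definition of the two-point correlation function,

$$\xi(r) \equiv \langle \delta(\mathbf{x})\, \delta(\mathbf{x}+\mathbf{r}) \rangle, \qquad (A2)$$

where $\langle ... \rangle$ represents an ensemble average. The overdensity field $\delta$ is defined as

$$\delta(\mathbf{x}) = \frac{n(\mathbf{x}) - \bar{n}}{\bar{n}}, \qquad (A3)$$

where $n(\mathbf{x})$ is the galaxy density at $\mathbf{x}$ and $\bar{n}$ is the mean. Let us assume that the galaxy sample is composed of several
sub-samples, $n(\mathbf{x}) = \sum_i n_i(\mathbf{x})$. Now we decompose the overdensity into different components based on sub-samples,

$$\delta(\mathbf{x}) = \sum_i \tilde{\delta}_i(\mathbf{x}), \qquad (A4)$$

where $\tilde{\delta}_i$ is the overdensity contributed by the $i$-th component (sub-sample),

$$\tilde{\delta}_i = \frac{n_i - \bar{n}_i}{\bar{n}} = \frac{n_i - \bar{n}_i}{\bar{n}_i} \times \frac{\bar{n}_i}{\bar{n}} = \delta_i \frac{\bar{n}_i}{\bar{n}}. \qquad (A5)$$

Note that in the above equation, $\delta_i$ is the $i$-th component's own overdensity field (i.e., the fractional fluctuation with
respect to $\bar{n}_i$ instead of $\bar{n}$).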

Substituting equations (A4) and (A5) into (A2), we obtain

$$\xi(r) = \sum_i \langle \delta_i(\mathbf{x})\, \delta_i(\mathbf{x}+\mathbf{r}) \rangle \frac{\bar{n}_i^2}{\bar{n}^2} + \sum_{i<j} \langle \delta_i(\mathbf{x})\, \delta_j(\mathbf{x}+\mathbf{r}) \rangle \frac{2\bar{n}_i \bar{n}_j}{\bar{n}^2}. \qquad (A6)$$

That is, the total correlation function is a weighted sum of the auto- and cross-correlation functions of all components,
where the weight is the pair fraction. In terms of measurement from pair counts in a galaxy catalog and an auxiliary
random catalog, it converts to

$$\xi(r) = \sum_{i \le j} \frac{dd_{ij}(r) - rr_{ij}(r)}{rr_{ij}(r)}\, f_{ij}, \qquad (A7)$$

where $dd_{ij}$ and $rr_{ij}$ are the $ij$ data-data and random-random pair counts. The quantity $f_{ij}$ is the overall $ij$ pair fraction (the
ratio of the total number of $ij$ pairs in the volume to that of all pairs in the volume). Note that, for random pairs, the
ratio of the number of random $ij$ pairs to that of all random pairs is independent of separation, i.e., $f_{ij} = rr_{ij}(r)/RR(r)$,
where $RR$ is the number of random pairs for all galaxies. Therefore, we have the contribution from the $ij$ component
as

$$\xi_{ij}(r) = \frac{dd_{ij}(r) - rr_{ij}(r)}{RR(r)}, \qquad (A8)$$

where $rr_{ij}/RR$ is a known quantity given the number density of each component, so one only needs to measure $dd_{ij}(r)$
and $RR(r)$.

An interesting application of the above results to real data is to measure 2PCFs (either projected ones or redshift-space
ones) of sub-samples of galaxies in a volume-limited sample (e.g., a sample of galaxies divided into blue, green,
and red galaxy sub-samples). To measure all the two-point auto-correlation functions of galaxies in the sub-samples
and their two-point cross-correlation functions, we do not need to construct random catalogs for each sub-sample. We
only need one random catalog for the whole sample and can measure all the correlation functions in a single run based
on equation (A8). The two-point auto-correlation function for all the galaxies in the whole sample, as the weighted
sum of the two-point auto- and cross-correlation functions of the sub-samples (eq. A6), is obtained for free. To generalize
equation (A8) in the spirit of the widely used Landy-Szalay estimator (Landy & Szalay 1993), one can replace $-rr_{ij}(r)$
with $-2dr_{ij}(r) + rr_{ij}(r)$. To count $dr_{ij}$ data-random pairs, one may randomly tag the points in the random catalog
with component indices according to the fraction of the sub-sample galaxy spatial density in the overall sample.
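
The single-random-catalog measurement of equation (A8) can be sketched as below. This is an illustrative brute-force implementation under assumptions that are not part of the paper: a periodic cubic box, unclustered toy data, the hypothetical function names `pair_separations` and `decompose_2pcf`, and the pair fraction $f_{ij}$ computed directly from the sub-sample sizes. It uses the simple $(dd - rr)/RR$ form rather than the Landy-Szalay generalization.

```python
import numpy as np

def pair_separations(pos, box):
    """Indices and separations of all unordered pairs in a periodic cube of side `box`."""
    i, j = np.triu_indices(len(pos), k=1)
    d = np.abs(pos[i] - pos[j])
    d = np.minimum(d, box - d)                 # periodic wrapping
    return i, j, np.sqrt((d**2).sum(axis=1))

def decompose_2pcf(pos, tags, rand, bins, box):
    """Component contributions xi_ij(r) = [dd_ij(r) - rr_ij(r)] / RR(r), eq. (A8)."""
    n, nr = len(pos), len(rand)
    i, j, sep = pair_separations(pos, box)
    _, _, rsep = pair_separations(rand, box)
    # RR(r) from the ONE random catalog, rescaled to the number of data pairs.
    RR = np.histogram(rsep, bins)[0] * (n * (n - 1)) / (nr * (nr - 1))
    labels = np.unique(tags)
    xi = {}
    for k, a in enumerate(labels):
        for b in labels[k:]:
            sel = ((tags[i] == a) & (tags[j] == b)) | ((tags[i] == b) & (tags[j] == a))
            dd = np.histogram(sep[sel], bins)[0]
            na, nb = np.sum(tags == a), np.sum(tags == b)
            n_pairs = na * (na - 1) / 2 if a == b else na * nb
            f = n_pairs / (n * (n - 1) / 2)    # overall ij pair fraction
            xi[(str(a), str(b))] = (dd - f * RR) / RR   # rr_ij(r) = f * RR(r)
    return xi, RR

# Toy catalog (made-up, unclustered points) just to exercise the bookkeeping.
rng = np.random.default_rng(0)
box = 100.0
pos = rng.uniform(0.0, box, (300, 3))
tags = rng.choice(np.array(["blue", "red"]), size=300)
rand = rng.uniform(0.0, box, (600, 3))
bins = np.array([5.0, 10.0, 15.0, 20.0, 25.0])
xi, RR = decompose_2pcf(pos, tags, rand, bins, box)
xi_total = sum(xi.values())                    # total 2PCF, obtained "for free"
```

Dividing `xi[("blue", "red")]` by its pair fraction recovers the cross-correlation function of the two sub-samples themselves, as in equation (A7).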

Details on the Decomposition of the 2PCF into Central/Satellite and One-halo/Two-halo Terms

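Following similar reasoning as in § A.1, now let us tag galaxies with two subscripts and decompose the overdensity
field as

$$\delta = \sum_{i,\alpha} \tilde{\delta}_{i\alpha}, \qquad (A9)$$

where $i$ denotes the ID of the host halo and $\alpha$ is either $c$ (central) or $s$ (satellite). That is, the overdensity is decomposed
into contributions from central and satellite galaxies from each halo. The random catalog can be obtained by randomly
redistributing all the galaxies in the volume with their tags untouched.
In a similar way as before, we can write $\tilde{\delta}_{i\alpha}$ as

$$\tilde{\delta}_{i\alpha} = \frac{n_{i\alpha} - \bar{n}_{i\alpha}}{\bar{n}_{i\alpha}} \times \frac{\bar{n}_{i\alpha}}{\bar{n}} = \delta_{i\alpha} \frac{\bar{n}_{i\alpha}}{\bar{n}}. \qquad (A10)$$

It is straightforward to show that $\xi(r)$ can be formally decomposed as

$$
\begin{aligned}
\xi(r) = {} & \sum_i \langle \delta_{ic}(\mathbf{x})\, \delta_{ic}(\mathbf{x}+\mathbf{r}) \rangle \frac{\bar{n}_{ic}^2}{\bar{n}^2}
 + \sum_i \langle \delta_{ic}(\mathbf{x})\, \delta_{is}(\mathbf{x}+\mathbf{r}) \rangle \frac{2\bar{n}_{ic}\bar{n}_{is}}{\bar{n}^2}
 + \sum_i \langle \delta_{is}(\mathbf{x})\, \delta_{is}(\mathbf{x}+\mathbf{r}) \rangle \frac{\bar{n}_{is}^2}{\bar{n}^2} \\
 & + \sum_{i<j} \langle \delta_{ic}(\mathbf{x})\, \delta_{jc}(\mathbf{x}+\mathbf{r}) \rangle \frac{2\bar{n}_{ic}\bar{n}_{jc}}{\bar{n}^2}
 + \sum_{i \neq j} \langle \delta_{ic}(\mathbf{x})\, \delta_{js}(\mathbf{x}+\mathbf{r}) \rangle \frac{2\bar{n}_{ic}\bar{n}_{js}}{\bar{n}^2}
 + \sum_{i<j} \langle \delta_{is}(\mathbf{x})\, \delta_{js}(\mathbf{x}+\mathbf{r}) \rangle \frac{2\bar{n}_{is}\bar{n}_{js}}{\bar{n}^2}. \qquad (A11)
\end{aligned}
$$

It is easy to identify the six terms on the right-hand side as contributions from the one-halo cen-cen (which is a
Dirac $\delta_D$ function that we are not interested in), the one-halo cen-sat, the one-halo sat-sat, the two-halo cen-cen, the
two-halo cen-sat, and the two-halo sat-sat pairs, respectively ('cen' for central galaxy and 'sat' for satellite galaxy).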

In terms of measurement, each component can be reduced to the $(dd - rr)/RR$ form. As an example, consider the
case of the two-halo cen-sat term. Note that $\bar{n}_{ic} = 1/V$, $\bar{n}_{is} = N_{is}/V$, and $\bar{n} = N/V$, where $N_{is}$ is the number of
satellites in the halo of ID $i$ and $N$ is the total number of all galaxies in the volume $V$. Therefore, the two-halo cen-sat
contribution is

$$\xi_{2h,cs}(r) = \sum_{i \neq j} \langle \delta_{ic}(\mathbf{x})\, \delta_{js}(\mathbf{x}+\mathbf{r}) \rangle \frac{2\bar{n}_{ic}\bar{n}_{js}}{\bar{n}^2} = \sum_{i \neq j} \frac{dd_{ic,js}(r) - rr_{ic,js}(r)}{rr_{ic,js}(r)}\, \frac{2N_{js}}{N^2}, \qquad (A12)$$

where $dd_{ic,js}$ and $rr_{ic,js}$ are the numbers of data-data and random-random pairs between galaxies tagged as $ic$ and
$js$. One thing to notice is that $rr_{ic,js}(r)/N_{js}$ does not depend on $i$ and $j$; it equals $rr_{cs'}(r)/N_{pair,cs'}$, where
$rr_{cs'}(r) = \sum_{i \neq j} rr_{ic,js}(r)$ is the count of all the random "two-halo" cen-sat pairs with separation around $r$ and
$N_{pair,cs'}$ is the total number of "two-halo" cen-sat pairs in the volume ($cs'$ denotes a "two-halo" pair). Also noting
that $N^2/2 = N_{pair,total}$ ($N \gg 1$), we then have

$$\xi_{2h,cs}(r) = \frac{dd_{cs'}(r) - rr_{cs'}(r)}{rr_{cs'}(r)/N_{pair,cs'}}\, \frac{1}{N_{pair,total}}, \qquad (A13)$$

where $dd_{cs'}$ and $rr_{cs'}$ are all the data-data and random-random "two-halo" cen-sat pairs with separation around $r$.
We have the following relation between $rr_{cs'}$ and the total number of all random pairs around separation $r$, $RR(r)$,

$$\frac{rr_{cs'}(r)}{N_{pair,cs'}} = \frac{RR(r)}{N_{pair,total}}. \qquad (A14)$$

Therefore, we end up with

$$\xi_{2h,cs}(r) = \frac{dd_{cs'}(r) - rr_{cs'}(r)}{RR(r)}. \qquad (A15)$$
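
The five-component measurement can be sketched in the same $(dd - rr)/RR$ form as equation (A15). The code below is a hedged illustration, not the authors' pipeline: the mock catalog (random halo centers with Gaussian-scattered satellites), the function names, and the use of an analytic $RR(r)$ for a periodic box (valid only for separations below half the box size) in place of a random catalog are all assumptions made here.

```python
import numpy as np

CLASSES = ("1h-cen-sat", "1h-sat-sat", "2h-cen-cen", "2h-cen-sat", "2h-sat-sat")

def classify_pairs(halo_id, is_cen, i, j):
    """Label each pair by one-/two-halo origin and by central/satellite content."""
    same = halo_id[i] == halo_id[j]
    n_cen = is_cen[i].astype(int) + is_cen[j].astype(int)   # 0, 1, or 2 centrals
    kind = np.array(["sat-sat", "cen-sat", "cen-cen"])[n_cen]
    return np.char.add(np.where(same, "1h-", "2h-"), kind)

def five_component_2pcf(pos, halo_id, is_cen, bins, box):
    """xi_c(r) = [dd_c(r) - rr_c(r)] / RR(r) for each pair class c."""
    n = len(pos)
    i, j = np.triu_indices(n, k=1)
    d = np.abs(pos[i] - pos[j])
    d = np.minimum(d, box - d)                               # periodic wrapping
    sep = np.sqrt((d**2).sum(axis=1))
    cls = classify_pairs(halo_id, is_cen, i, j)
    n_total = n * (n - 1) / 2
    shell = 4.0 / 3.0 * np.pi * np.diff(np.asarray(bins, dtype=float) ** 3)
    RR = n_total * shell / box**3        # ASSUMED analytic random counts (periodic box)
    out = {}
    for c in CLASSES:
        sel = cls == c
        dd = np.histogram(sep[sel], bins)[0]
        rr = RR * sel.sum() / n_total    # eq. (A14): rr_c / N_pair,c = RR / N_pair,total
        out[c] = (dd - rr) / RR          # eq. (A15)
    return out, RR, sep

# Hypothetical mock: one central per halo plus Poisson-distributed satellites.
rng = np.random.default_rng(1)
box, n_halo = 100.0, 150
centers = rng.uniform(0.0, box, (n_halo, 3))
sat_host = np.repeat(np.arange(n_halo), rng.poisson(1.0, n_halo))
sat_pos = (centers[sat_host] + rng.normal(0.0, 0.5, (len(sat_host), 3))) % box
pos = np.vstack([centers, sat_pos])
halo_id = np.concatenate([np.arange(n_halo), sat_host])
is_cen = np.concatenate([np.ones(n_halo, bool), np.zeros(len(sat_host), bool)])
bins = np.array([0.1, 1.0, 5.0, 10.0, 20.0])
xi5, RR, sep = five_component_2pcf(pos, halo_id, is_cen, bins, box)
```

Because every pair falls into exactly one class and the class pair fractions sum to one, the five components add up to the total $(DD - RR)/RR$ by construction.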